Public Internet Correlation between Population Density or Schools

DSAN 6750 / PPOL 6805: GIS for Spatial Data Science

Author
Affiliation

Gabriel Soto

Georgetown University

Introduction

Do public schools get better access to public internet? This research analyses whether public schools in Panama, get access to public internet via access points - Google Satellite - Migration Panama

  • meter data temporal de uso de datos por access points
  • agregar datos de cantidad de estudiantes por distrito https://www.inec.gob.pa/archivos/P030194820231213142523Cuadro%2021.pdf
  • agregar un indice normalizado (ya que el numero es muy chiquito) de la cantidad de casas sin internet https://www.inec.gob.pa/archivos/P0705547520240202111515Cuadro%201.pdf

Future: - indicadores socioeconomicos a nivel de distrito https://www.inec.gob.pa/archivos/P0579518620240202083001Cuadro%204.pdf

dijo:

It’s the remoteness of Oyala that makes it so appealing to President Obiang. In a rare interview he described how rebels had recently plotted a seaborne assault on his palace in the current capital, Malabo. ‘We need a secure place for my government and for future governments. That’s why we have created Oyala, to guarantee the government of Equatorial Guinea.’ (Sackur 2012)

This case is far from exceptional, as an even more recent Washington Post article points out with respect to Myanmar’s decision to move its capital from Yangon to Naypyidaw:

Analysts have described the decision as motivated by a desire to secure the military’s seat of power from any threat of protests or invasions. (Berger 2021)

Most of these studies, however, are based on observations of conflict events. In this study, we study the more fundamental variable of a capital’s distance from the population centroid of the country.

Literature Review

Campante, Do, and Guimaraes (2019) analyzes the relationship between the location of a capital city and the degree of conflict and misgovernance in a given country. Their two key findings are that:

Conflict is more likely to emerge (and dislodge incumbents) closer to the capital

and

Isolated capitals are associated with misgovernance.

This first finding is illustrated in ?@fig-conflict-dist

Code
#library
library(sf)
Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
Code
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
Code
library(stringr)
library(ggplot2)
library(tmap)
Warning: package 'tmap' was built under R version 4.4.2
Breaking News: tmap 3.x is retiring. Please test v4, e.g. with
remotes::install_github('r-tmap/tmap')
Code
library(readr)
library(DT)
Warning: package 'DT' was built under R version 4.4.2
Code
library(skimr)
Warning: package 'skimr' was built under R version 4.4.2
Code
library(plotly)

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout
Code
library(leaflet)
Code
#functions
convert_dms_to_decimal <- function(dms_str) {
  # Extract components
  parts <- str_match(dms_str, "(\\d+)° (\\d+)' (\\d+\\.?\\d*)\"\" ([N|S|E|W])")
  
  # Convert components to numeric
  degrees <- as.numeric(parts[, 2])
  minutes <- as.numeric(parts[, 3])
  seconds <- as.numeric(parts[, 4])
  
  # Calculate decimal degrees
  decimal <- degrees + minutes/60 + seconds/3600
  
  # Apply sign based on direction
  direction <- parts[, 5]
  decimal <- ifelse(direction %in% c("S", "W"), -decimal, decimal)
  
  return(decimal)
}
Code
#schools data
schools_csv <- read.csv("data-spatial/schools-data.csv")

#district data
district_path <- "data-spatial/panama-districts/gadm41_PAN_2.json"
# Read GeoJSON data
district_data <- st_read(district_path, quiet = TRUE)
# Fix any invalid geometries
district_data <- st_make_valid(district_data)

#Panama highway
# Read GeoJSON data
highway_data <- st_read("data-spatial/panam-highway.geojson", quiet = TRUE)

#access points data
ap_csv <- read.csv("data-spatial/access-points.csv", fileEncoding="latin1")
ap_data <- st_as_sf(ap_csv, 
                      coords = c("long", "lat"),
                      crs = 4326)
Code
#Cleaning data

# Convert your coordinates
schools_csv$decimal_lat <- convert_dms_to_decimal(schools_csv$lat)
schools_csv$decimal_long <- convert_dms_to_decimal(schools_csv$long)

#removing long, lat columns and creating schools_df
schools_df <- schools_csv %>% select(-lat) %>% select(-long)

# Create SF object with converted coordinates from schools_df
schools_data <- st_as_sf(schools_df, 
                      coords = c("decimal_long", "decimal_lat"),
                      crs = 4326)
Code
rm(schools_csv, district_path, ap_csv, schools_df)
Code
#province summary
province_summary <- schools_data %>%
  st_drop_geometry() %>%
  group_by(province) %>%
  summarise(schools = n()) %>%
  full_join(
    ap_data %>%
      st_drop_geometry() %>%
      group_by(province) %>%
      summarise(access_points = n()),
    by = "province"
  ) %>%
  full_join(
    district_data %>%
      st_drop_geometry() %>%
      group_by(NAME_1) %>%
      summarise(population = sum(population)),
    by = c("province" = "NAME_1")
  ) %>%
  mutate(
    ap_per_school = round(access_points/schools, 2),
    ap_per_1000_people = round((access_points/population) * 1000, 2)
  )


#district summary
district_summary <- schools_data %>%
  st_drop_geometry() %>%  # Remove spatial component
  group_by(district) %>%
  summarise(schools = n()) %>%
  full_join(
    ap_data %>%
      st_drop_geometry() %>%  # Remove spatial component
      group_by(district) %>%
      summarise(access_points = n()),
    by = "district"
  ) %>%
  full_join(
    district_data %>%
      st_drop_geometry() %>%
      select(NAME_2, population),
    by = c("district" = "NAME_2")
  ) %>%
  mutate(
    access_points = coalesce(access_points, 0),  # Replace NA with 0
    ap_per_school = round(access_points/schools, 2),
    ap_per_1000_people = round((access_points/population) * 1000, 2)  
  )

district_summary <- district_summary %>%
  filter(!is.na(population), !is.na(access_points)) %>%
  arrange(district) 


datatable(province_summary,
          colnames = c("Province", "Number of Schools", "Number of Access Points", "Population", "AP per School", "AP per 1000 people"),
          options = list(pageLength = 20))
Code
datatable(district_summary,
          colnames = c("District", "Number of Schools", "Number of Access Points", "Population", "AP per School", "AP per 1000 people"),
          options = list(pageLength = 20))
Code
province_summary
province schools access_points population ap_per_school ap_per_1000_people
Bocas del Toro 166 54 113871.0 0.33 0.47
Chiriqui 460 227 447546.0 0.49 0.51
Cocle 343 160 268264.0 0.47 0.60
Colon 201 70 278395.0 0.35 0.25
Comarca Embera 39 3 12358.0 0.08 0.24
Comarca Kuna Yala 48 7 32016.0 0.15 0.22
Comarca Ngobe Bugle 307 23 129165.5 0.07 0.18
Darien 154 36 34506.0 0.23 1.04
Herrera 200 65 117786.0 0.32 0.55
Los Santos 176 70 98466.0 0.40 0.71
Panama 640 377 1439575.0 0.59 0.26
Veraguas 500 110 236944.2 0.22 0.46
Panama Oeste NA 103 653665.0 NA 0.16
Code
district_summary
district schools access_points population ap_per_school ap_per_1000_people
Aguadulce 32 30 49005.0 0.94 0.61
Alanje 22 7 18877.0 0.32 0.37
Anton 46 27 59194.0 0.59 0.46
Arraijan 42 43 299079.0 1.02 0.14
Atalaya 14 3 17507.0 0.21 0.17
Balboa 9 0 1989.0 0.00 0.00
Baru 83 38 56307.0 0.46 0.67
Besiko 34 1 31422.0 0.03 0.03
Bocas del Toro 35 6 285.0 0.17 21.05
Boqueron 24 8 21001.0 0.33 0.38
Boquete 20 7 23562.0 0.35 0.30
Bugaba 79 45 68870.0 0.57 0.65
Calobre 45 2 11666.0 0.04 0.17
Capira 78 14 45629.0 0.18 0.31
Cañazas 63 1 16933.0 0.02 0.06
Cemaco 28 3 9547.0 0.11 0.31
Chagres 37 3 10968.0 0.08 0.27
Chame 26 12 28535.0 0.46 0.42
Changuinola 109 41 101091.0 0.38 0.41
Chepigana 114 22 12983.0 0.19 1.69
Chepo 105 32 65588.0 0.30 0.49
Chiman 17 0 3142.0 0.00 0.00
Chiriqui Grande 22 7 12495.0 0.32 0.56
Chitre 17 33 60957.0 1.94 0.54
Colon 89 57 240722.0 0.64 0.24
Comarca Kuna Yala 48 7 32016.0 0.15 0.22
David 77 58 156498.0 0.75 0.37
Dolega 26 17 37678.0 0.65 0.45
Donoso 52 3 12274.0 0.06 0.24
Gualaca 23 5 9831.0 0.22 0.51
Guarare 17 11 12107.0 0.65 0.91
Kankintu 68 2 19751.0 0.03 0.10
Kusapin 50 3 17047.0 0.06 0.18
La Chorrera 73 26 258221.0 0.36 0.10
La Mesa 29 8 12238.0 0.28 0.65
La Pintada 66 22 29698.0 0.33 0.74
Las Minas 43 2 6642.0 0.05 0.30
Las Palmas 53 4 1015.2 0.08 3.94
Las Tablas 39 20 30440.0 0.51 0.66
Los Pozos 36 4 6928.0 0.11 0.58
Los Santos 28 16 30028.0 0.57 0.53
Macaracas 34 4 8965.0 0.12 0.45
Mirono 32 1 21800.0 0.03 0.05
Montijo 43 3 6784.0 0.07 0.44
Muna 70 8 799.5 0.11 10.01
Nata 30 12 19741.0 0.40 0.61
Nole Duima 20 4 20709.0 0.20 0.19
Ocu 50 11 16116.0 0.22 0.68
Ola 22 4 6300.0 0.18 0.63
Panama 207 278 1086990.0 1.34 0.26
Parita 13 3 9695.0 0.23 0.31
Pedasi 9 11 4942.0 1.22 2.23
Penonome 147 65 104326.0 0.44 0.62
Pese 31 7 8724.0 0.23 0.80
Pinogana 40 14 21523.0 0.35 0.65
Pocri 14 5 3025.0 0.36 1.65
Portobelo 13 4 10320.0 0.31 0.39
Remedios 5 4 4388.0 0.80 0.91
Renacimiento 38 20 22429.0 0.53 0.89
Rio de Jesus 13 3 4822.0 0.23 0.62
Sambu 11 0 2811.0 0.00 0.00
San Carlos 25 8 22201.0 0.32 0.36
San Felix 8 6 6881.0 0.75 0.87
San Francisco 30 4 10107.0 0.13 0.40
San Lorenzo 16 7 8031.0 0.44 0.87
San Miguelito 56 65 280777.0 1.16 0.23
Santa Fe 39 4 18023.0 0.10 0.22
Santa Isabel 10 3 4111.0 0.30 0.73
Santa Maria 10 5 8724.0 0.50 0.57
Santiago 91 51 109605.0 0.56 0.47
Sona 78 23 28244.0 0.29 0.81
Taboga 2 2 1089.0 1.00 1.84
Tole 39 5 13193.0 0.13 0.38
Tonosi 35 3 8959.0 0.09 0.33
Ñurum 32 4 17637.0 0.12 0.23
Code
district_summary <- district_summary %>%
  filter(!is.na(population), !is.na(access_points)) %>%
  arrange(district) 


plot_ly(district_summary, 
        x = ~population, 
        y = ~access_points,
        type = 'scatter',
        mode = 'markers',
        marker = list(
          size = 10,
          color = 'rgb(49,130,189)',
          opacity = 0.7,
          line = list(color = 'rgb(27,71,105)', width = 1)
        ),
        text = ~district,
        hovertemplate = paste(
          '<b>District:</b> %{text}<br>',
          '<b>Population:</b> %{x:,}<br>',
          '<b>Access Points:</b> %{y}<br>',
          '<b>Schools:</b> ', district_summary$schools, '<br>',
          '<b>AP per School:</b> ', district_summary$ap_per_school,
          '<extra></extra>'
        )
) %>%
  add_lines(x = ~population,
            y = ~fitted(lm(access_points ~ population, data = district_summary)),
            line = list(color = 'rgb(219,64,82)',
                       width = 2,
                       dash = 'dash'),
            showlegend = TRUE,
            name = 'Trend line') %>%
  layout(
    title = list(
      text = "Access Points vs Population by District",
      font = list(size = 24)
    ),
    xaxis = list(
      title = "Population",
      gridcolor = 'rgb(255,255,255)',
      gridwidth = 2,
      showgrid = TRUE,
      zeroline = FALSE
    ),
    yaxis = list(
      title = "Number of Access Points",
      gridcolor = 'rgb(255,255,255)',
      gridwidth = 2,
      showgrid = TRUE,
      zeroline = FALSE
    ),
    paper_bgcolor = 'rgb(251,251,251)',
    plot_bgcolor = 'rgb(251,251,251)',
    showlegend = TRUE
  )
A marker object has been specified, but markers is not in the mode
Adding markers to the mode...
Code
library(plotly)

# Create side-by-side plots
subplot(
  # Left plot: Access Points vs Population
  plot_ly(district_summary, 
          x = ~population, 
          y = ~access_points,
          type = 'scatter',
          mode = 'markers',
          marker = list(
            size = 10,
            color = 'rgb(49,130,189)',
            opacity = 0.7,
            line = list(color = 'rgb(27,71,105)', width = 1)
          ),
          text = ~district,
          hovertemplate = paste(
            '<b>District:</b> %{text}<br>',
            '<b>Population:</b> %{x:,}<br>',
            '<b>Access Points:</b> %{y}<br>',
            '<extra></extra>'
          )
  ) %>%
    add_lines(x = ~population,
              y = ~fitted(lm(access_points ~ population, data = district_summary)),
              line = list(color = 'rgb(219,64,82)',
                         width = 2,
                         dash = 'dash'),
              showlegend = TRUE,
              name = 'Access Points Trend') %>%
    layout(
      title = "Access Points vs Population",
      xaxis = list(
        title = "Population",
        gridcolor = 'rgb(255,255,255)',
        gridwidth = 2,
        showgrid = TRUE,
        zeroline = FALSE
      ),
      yaxis = list(
        title = "Number of Access Points",
        gridcolor = 'rgb(255,255,255)',
        gridwidth = 2,
        showgrid = TRUE,
        zeroline = FALSE
      ),
      paper_bgcolor = 'rgb(251,251,251)',
      plot_bgcolor = 'rgb(251,251,251)'
    ),
  
  # Right plot: Access Points per School
  plot_ly(district_summary, 
          x = ~population, 
          y = ~ap_per_school,
          type = 'scatter',
          mode = 'markers',
          marker = list(
            size = 10,
            color = 'rgb(50,205,50)',
            opacity = 0.7,
            line = list(color = 'rgb(0,100,0)', width = 1)
          ),
          text = ~district,
          hovertemplate = paste(
            '<b>District:</b> %{text}<br>',
            '<b>Population:</b> %{x:,}<br>',
            '<b>Access Points per School:</b> %{y:.2f}<br>',
            '<b>Total Schools:</b> ', district_summary$schools, '<br>',
            '<extra></extra>'
          )
  ) %>%
    add_lines(x = ~population,
              y = ~fitted(lm(ap_per_school ~ population, data = district_summary)),
              line = list(color = 'rgb(255,69,0)',
                         width = 2,
                         dash = 'dash'),
              showlegend = TRUE,
              name = 'AP per School Trend') %>%
    layout(
      title = "Access Points per School vs Population",
      xaxis = list(
        title = "Population",
        gridcolor = 'rgb(255,255,255)',
        gridwidth = 2,
        showgrid = TRUE,
        zeroline = FALSE
      ),
      yaxis = list(
        title = "Access Points per School",
        gridcolor = 'rgb(255,255,255)',
        gridwidth = 2,
        showgrid = TRUE,
        zeroline = FALSE
      ),
      paper_bgcolor = 'rgb(251,251,251)',
      plot_bgcolor = 'rgb(251,251,251)'
    ),
  
  # Subplot layout
  nrows = 1,
  shareX = TRUE,
  titleX = TRUE
)
A marker object has been specified, but markers is not in the mode
Adding markers to the mode...
A marker object has been specified, but markers is not in the mode
Adding markers to the mode...
Code
# Run linear regression
model <- lm(access_points ~ population, data = district_summary)

# View summary of regression results
summary(model)

Call:
lm(formula = access_points ~ population, data = district_summary)

Residuals:
    Min      1Q  Median      3Q     Max 
-41.413  -4.688  -2.250   3.524  34.859 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 4.874e+00  1.302e+00   3.743  0.00036 ***
population  2.422e-04  8.992e-06  26.934  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.54 on 73 degrees of freedom
Multiple R-squared:  0.9086,    Adjusted R-squared:  0.9073 
F-statistic: 725.5 on 1 and 73 DF,  p-value: < 2.2e-16
Code
# For a more detailed analysis, we can also include additional variables
model_extended <- lm(access_points ~ population + schools, data = district_summary)
summary(model_extended)

Call:
lm(formula = access_points ~ population + schools, data = district_summary)

Residuals:
    Min      1Q  Median      3Q     Max 
-40.368  -3.912   0.572   4.103  19.345 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.568e+00  1.772e+00  -1.449    0.152    
population   2.058e-04  1.021e-05  20.171  < 2e-16 ***
schools      2.162e-01  4.019e-02   5.379 8.89e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 8.962 on 72 degrees of freedom
Multiple R-squared:  0.9348,    Adjusted R-squared:  0.933 
F-statistic:   516 on 2 and 72 DF,  p-value: < 2.2e-16
Code
# Create the map
ap_map <- leaflet(ap_data) %>%
  # Add base map tiles
  addProviderTiles(providers$CartoDB.Positron) %>%
  # Add access points
  addCircleMarkers(
    radius = 5,
    color = "blue",
    fillColor = "blue",
    fillOpacity = 0.6,
    stroke = TRUE,
    weight = 1,
    popup = ~paste(
      "<strong>Location:</strong>", name, "<br>",
      "<strong>Province:</strong>", province, "<br>",
      "<strong>District:</strong>", district, "<br>",
      "<strong>County:</strong>", county, "<br>",
      "<strong>Type:</strong>", type, "<br>",
      "<strong>Date:</strong>", date
    ),
    # Enable clustering to handle many points
    clusterOptions = markerClusterOptions()
  ) %>%
  # Set the initial view to Panama
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)

ap_map
Code
district_data_merged <- district_data %>%
  left_join(district_summary, by = c("NAME_2" = "district"))


breaks <- c(1, 10, 50, 100, 200, max(district_data_merged$access_points))
pal <- colorBin(
  palette = "RdYlGn",
  domain = district_data_merged$access_points,
  bins = breaks,
  reverse = FALSE
)

# Create the map
leaflet(district_data_merged) %>%
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  addPolygons(
    fillColor = ~pal(access_points),
    fillOpacity = 0.7,
    weight = 1,
    color = "#666",
    highlightOptions = highlightOptions(
      weight = 2,
      color = "#666",
      fillOpacity = 0.9,
      bringToFront = TRUE
    ),
    popup = ~paste(
      "<strong>District:</strong>", NAME_2, "<br>",
      "<strong>Province:</strong>", NAME_1, "<br>",
      "<strong>Access Points:</strong>", access_points, "<br>",
      "<strong>Schools:</strong>", schools
    )
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = ~access_points,
    title = "Number of Access Points",
    opacity = 0.7,
    labFormat = labelFormat(digits = 0)  # Show integers only
  ) %>%
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Warning in pal(access_points): Some values were outside the color scale and
will be treated as NA
Code
library(osmdata)
Warning: package 'osmdata' was built under R version 4.4.2
Data (c) OpenStreetMap contributors, ODbL 1.0. https://www.openstreetmap.org/copyright
Code
library(sf)

panama_bb <- getbb("Panama")
highway <- opq(panama_bb) %>%
  add_osm_feature(key = "name", value = "Carretera Panamericana") %>%
  osmdata_sf()

leaflet() %>%
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  addPolylines(
    data = highway$osm_lines,  # Using the lines from your downloaded data
    color = "red",
    weight = 2,
    opacity = 0.8,
    popup = ~name  # This will show the road name when clicked
  ) %>%
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Code
pal <- colorFactor(
  palette = "Set3",
  domain = unique(schools_data$type)
)

leaflet() %>%
  # Add a base map (you can change the provider as we discussed)
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  # Add school points with clustering
  addCircleMarkers(
    data = schools_data,
    radius = 6,
    color = ~pal(type),
    fillOpacity = 1,
    stroke = TRUE,
    weight = 1,
    popup = ~paste(
      "<strong>School:</strong>", name, "<br>",
      "<strong>Type:</strong>", type, "<br>",
      "<strong>Province:</strong>", province, "<br>",
      "<strong>District:</strong>", district, "<br>",
      "<strong>County:</strong>", county
    ),
    clusterOptions = markerClusterOptions()
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal,
    values = schools_data$type,
    title = "School Type"
  ) %>%
  # Set the initial view to Panama
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Code
districts_with_highway <- st_intersection(district_data, highway$osm_lines)
Warning: attribute variables are assumed to be spatially constant throughout
all geometries
Code
highway_districts <- unique(districts_with_highway$NAME_2)


leaflet() %>%
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  # Add all districts in light gray
  addPolygons(
    data = district_data,
    fillColor = "lightgray",
    fillOpacity = 0.3,
    weight = 1,
    color = "gray"
  ) %>%
  # Highlight districts with highway in blue
  addPolygons(
    data = district_data[district_data$NAME_2 %in% highway_districts,],
    fillColor = "yellow",
    fillOpacity = 0.3,
    weight = 1,
    color = "blue",
    popup = ~NAME_2
  ) %>%
  # Add the highway in red
  addPolylines(
    data = highway$osm_lines,
    color = "red",
    weight = 2
  ) %>%
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Code
library(spdep)
Warning: package 'spdep' was built under R version 4.4.2
Loading required package: spData
To access larger datasets in this package, install the spDataLarge
package with: `install.packages('spDataLarge',
repos='https://nowosad.github.io/drat/', type='source')`
Code
district_data_complete <- merge(district_data, 
                              district_summary, 
                              by.x = "NAME_2", 
                              by.y = "district")
# 1. Create a neighbors list based on district boundaries
# Queens case (districts sharing any boundary point)
nb <- poly2nb(district_data_complete, queen = TRUE)
Warning in poly2nb(district_data_complete, queen = TRUE): some observations have no neighbours;
if this seems unexpected, try increasing the snap argument.
Warning in poly2nb(district_data_complete, queen = TRUE): neighbour object has 3 sub-graphs;
if this sub-graph count seems unexpected, try increasing the snap argument.
Code
# 2. Create spatial weights
w <- nb2listw(nb, style = "W", zero.policy = TRUE)

# Now calculate Moran's I with zero.policy=TRUE
moran_test <- moran.test(district_data_complete$access_points, w, zero.policy = TRUE)
print(moran_test)

    Moran I test under randomisation

data:  district_data_complete$access_points  
weights: w  
n reduced by no-neighbour observations  

Moran I statistic standard deviate = 4.7929, p-value = 8.218e-07
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.229408902      -0.013888889       0.002576752 
Code
# For visualization
moran.plot(district_data_complete$access_points, w, 
          labels = district_data_complete$NAME_2,
          xlab = "Access Points",
          ylab = "Spatially Lagged Access Points",
          zero.policy = TRUE)

Code
# Calculate local Moran's I
local_moran <- localmoran(district_data_complete$access_points, w, zero.policy = TRUE)

# Add results to your spatial dataframe
district_data_complete$local_moran <- local_moran[, 1]
district_data_complete$local_moran_p <- local_moran[, 5]

# Map it
leaflet(district_data_complete) %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    fillColor = ~colorNumeric("RdYlBu", local_moran)(local_moran),
    fillOpacity = 0.7,
    weight = 1,
    color = "white",
    popup = ~paste(
      "<strong>District:</strong>", NAME_2,
      "<br><strong>Local Moran's I:</strong>", round(local_moran, 3),
      "<br><strong>Access Points:</strong>", access_points
    )
  ) %>%
  addLegend(
    position = "bottomright",
    pal = colorNumeric("RdYlBu", local_moran[,1]),
    values = local_moran[,1],
    title = "Local Moran's I"
  )
Code
library(spdep)

# 1. Access Points and Schools relationship
# Standardize the variables
z_ap <- scale(district_data_complete$access_points)
z_schools <- scale(district_data_complete$schools)

# Calculate Moran's I for Access Points vs Schools
moran_ap_schools <- moran.test(district_data_complete$access_points, 
                              nb2listw(nb, style="W", zero.policy=TRUE), 
                              zero.policy=TRUE)

print("Moran's I for Access Points vs Schools:")
[1] "Moran's I for Access Points vs Schools:"
Code
print(moran_ap_schools)

    Moran I test under randomisation

data:  district_data_complete$access_points  
weights: nb2listw(nb, style = "W", zero.policy = TRUE)  
n reduced by no-neighbour observations  

Moran I statistic standard deviate = 4.7929, p-value = 8.218e-07
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.229408902      -0.013888889       0.002576752 
Code
# 2. Access Points and Density relationship
# Standardize density
z_density <- scale(district_data_complete$density)

# Calculate Moran's I for Access Points vs Density
moran_ap_density <- moran.test(district_data_complete$density, 
                              nb2listw(nb, style="W", zero.policy=TRUE), 
                              zero.policy=TRUE)

print("Moran's I for Access Points vs Density:")
[1] "Moran's I for Access Points vs Density:"
Code
print(moran_ap_density)

    Moran I test under randomisation

data:  district_data_complete$density  
weights: nb2listw(nb, style = "W", zero.policy = TRUE)  
n reduced by no-neighbour observations  

Moran I statistic standard deviate = 5.813, p-value = 3.068e-09
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
     0.1008972152     -0.0138888889      0.0003899232 
Code
# Create visualization plots
par(mfrow=c(1,2))  # Set up a 1x2 plotting area

# Plot for Access Points vs Schools
moran.plot(district_data_complete$access_points, 
          nb2listw(nb, style="W", zero.policy=TRUE),
          labels=district_data_complete$NAME_2,
          xlab="Access Points",
          ylab="Spatially Lagged Access Points",
          main="Access Points vs Schools",
          zero.policy=TRUE)

# Plot for Access Points vs Density
moran.plot(district_data_complete$density, 
          nb2listw(nb, style="W", zero.policy=TRUE),
          labels=district_data_complete$NAME_2,
          xlab="Population Density",
          ylab="Spatially Lagged Density",
          main="Access Points vs Density",
          zero.policy=TRUE)

Code
# Calculate ratio of access points to density
district_data_complete$ap_per_density <- district_data_complete$access_points / district_data_complete$density

# Calculate Moran's I
moran_density <- moran.test(district_data_complete$ap_per_density, w, zero.policy = TRUE)
print("Moran's I for Access Points relative to Population Density:")
[1] "Moran's I for Access Points relative to Population Density:"
Code
print(moran_density)

    Moran I test under randomisation

data:  district_data_complete$ap_per_density  
weights: w  
n reduced by no-neighbour observations  

Moran I statistic standard deviate = 4.3008, p-value = 8.51e-06
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.208806834      -0.013888889       0.002681206 
Code
# Calculate ratio of access points to schools
district_data_complete$ap_per_school <- district_data_complete$access_points / district_data_complete$schools

# Calculate Moran's I
moran_schools <- moran.test(district_data_complete$ap_per_school, w, zero.policy = TRUE)
print("Moran's I for Access Points relative to Schools:")
[1] "Moran's I for Access Points relative to Schools:"
Code
print(moran_schools)

    Moran I test under randomisation

data:  district_data_complete$ap_per_school  
weights: w  
n reduced by no-neighbour observations  

Moran I statistic standard deviate = 3.1227, p-value = 0.0008959
alternative hypothesis: greater
sample estimates:
Moran I statistic       Expectation          Variance 
      0.229109922      -0.013888889       0.006055367 
Code
# Set up to display two Moran plots side by side
par(mfrow=c(1,2))

# Plot for Access Points per Density
moran.plot(district_data_complete$ap_per_density, 
         w,
         labels=district_data_complete$NAME_2,
         xlab="Access Points per Density",
         ylab="Spatially Lagged AP per Density",
         main="Access Points per Density",
         zero.policy=TRUE)

# Plot for Access Points per School
moran.plot(district_data_complete$ap_per_school, 
         w,
         labels=district_data_complete$NAME_2,
         xlab="Access Points per School",
         ylab="Spatially Lagged AP per School",
         main="Access Points per School",
         zero.policy=TRUE)

Code
# Reset plotting parameters
par(mfrow=c(1,1))

# Let's also create choropleth maps to visualize these ratios
# Map for AP per Density
leaflet() %>%
 addProviderTiles(providers$CartoDB.Positron) %>%
 # First map: AP per Density
 addPolygons(
   data = district_data_complete,
   fillColor = ~colorNumeric("viridis", ap_per_density)(ap_per_density),
   fillOpacity = 0.7,
   weight = 1,
   color = "white",
   popup = ~paste(
     "<strong>District:</strong>", NAME_2,
     "<br><strong>AP per Density:</strong>", round(ap_per_density, 4)
   )
 ) 
Code
# Map for AP per School
leaflet() %>%
 addProviderTiles(providers$CartoDB.Positron) %>%
 # Second map: AP per School
 addPolygons(
   data = district_data_complete,
   fillColor = ~colorNumeric("viridis", ap_per_school)(ap_per_school),
   fillOpacity = 0.7,
   weight = 1,
   color = "white",
   popup = ~paste(
     "<strong>District:</strong>", NAME_2,
     "<br><strong>AP per School:</strong>", round(ap_per_school, 4)
   )
 ) 
Code
library(spatstat)
Warning: package 'spatstat' was built under R version 4.4.2
Loading required package: spatstat.data
Warning: package 'spatstat.data' was built under R version 4.4.2
Loading required package: spatstat.univar
Warning: package 'spatstat.univar' was built under R version 4.4.2
spatstat.univar 3.1-1
Loading required package: spatstat.geom
Warning: package 'spatstat.geom' was built under R version 4.4.2
spatstat.geom 3.3-4
Loading required package: spatstat.random
Warning: package 'spatstat.random' was built under R version 4.4.2
spatstat.random 3.3-2
Loading required package: spatstat.explore
Warning: package 'spatstat.explore' was built under R version 4.4.2
Loading required package: nlme

Attaching package: 'nlme'
The following object is masked from 'package:dplyr':

    collapse
spatstat.explore 3.3-3
Loading required package: spatstat.model
Warning: package 'spatstat.model' was built under R version 4.4.2
Loading required package: rpart
spatstat.model 3.3-3
Loading required package: spatstat.linnet
Warning: package 'spatstat.linnet' was built under R version 4.4.2
spatstat.linnet 3.2-3

spatstat 3.3-0 
For an introduction to spatstat, type 'beginner' 
Code
library(sf)

# First, transform both datasets to UTM Zone 17N (EPSG:32617)
schools_projected <- st_transform(schools_data, 32617)
ap_projected <- st_transform(ap_data, 32617)
district_projected <- st_transform(district_data, 32617)

# Create the observation window
window <- as.owin(st_union(district_projected))

# Create point pattern objects for both schools and access points
schools_coords <- st_coordinates(schools_projected)
ap_coords <- st_coordinates(ap_projected)

# Create a marked point pattern with both types of points
points_combined <- ppp(
  x = c(schools_coords[,1], ap_coords[,1]),
  y = c(schools_coords[,2], ap_coords[,2]),
  marks = factor(c(rep("school", nrow(schools_coords)),
                  rep("ap", nrow(ap_coords)))),
  window = window
)
Warning: 49 points were rejected as lying outside the specified window
Warning: data contain duplicated points
Code
# Calculate cross-type pair correlation function
cross_pcf <- pcfcross(points_combined, "school", "ap", correction = "translate")

# Plot the cross-type pair correlation function
plot(cross_pcf, main = "Cross-type PCF: Schools vs Access Points",
     xlab = "Distance (m)", ylab = "g(r)")
abline(h = 1, lty = 2)  # Add reference line at g(r) = 1

Code
# Calculate cross-type K function
cross_K <- Kcross(points_combined, "school", "ap", correction = "border")

# Plot the cross-type K function
plot(cross_K, main = "Cross-type K function: Schools vs Access Points",
     xlab = "Distance (m)")

Code
# Calculate cross-type L function
cross_L <- Lcross(points_combined, "school", "ap", correction = "border")

# Plot the cross-type L function
plot(cross_L, main = "Cross-type L function: Schools vs Access Points",
     xlab = "Distance (m)")

Code
library(sf)

# First project to UTM for accurate buffer creation
highway_projected <- st_transform(highway$osm_lines, 32617)  # UTM 17N
ap_projected <- st_transform(ap_data, 32617)

# Create buffers in UTM
buffer_5km <- st_buffer(highway_projected, 5000)
buffer_10km <- st_buffer(highway_projected, 10000)

# Transform everything back to WGS84 for mapping
highway_wgs84 <- st_transform(highway_projected, 4326)
ap_wgs84 <- st_transform(ap_projected, 4326)
buffer_5km_wgs84 <- st_transform(buffer_5km, 4326)
buffer_10km_wgs84 <- st_transform(buffer_10km, 4326)

# Create the map
leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  # Add the buffers
  addPolygons(
    data = buffer_10km_wgs84,
    fillColor = "yellow",
    fillOpacity = 0.2,
    color = "yellow",
    weight = 1
  ) %>%
  addPolygons(
    data = buffer_5km_wgs84,
    fillColor = "blue",
    fillOpacity = 0.2,
    color = "blue",
    weight = 1
  ) %>%
  # Add the highway
  addPolylines(
    data = highway_wgs84,
    color = "black",
    weight = 2
  ) %>%
  # Add access points
  addCircleMarkers(
    data = ap_wgs84,
    radius = 3,
    color = "red",
    fillOpacity = 0.7
  ) %>%
  # Set the view to Panama
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Code
# Calculate distances from each access point to the nearest part of the highway
distances <- st_distance(ap_projected, highway_projected)
min_distances <- apply(distances, 1, min)  # Get minimum distance for each point

# Convert to kilometers for easier interpretation
min_distances_km <- min_distances/1000

# Basic statistics
distance_stats <- summary(min_distances_km)
print("Summary statistics of distances (in km):")
[1] "Summary statistics of distances (in km):"
Code
print(distance_stats)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
  0.00033   1.54627   8.06498  18.26326  23.90969 112.14684 
Code
# Create a histogram of distances
hist(min_distances_km, 
     breaks = 30,
     main = "Distribution of Access Point Distances from Highway",
     xlab = "Distance (km)",
     ylab = "Number of Access Points")

Code
# We can also map this, coloring points by their distance to the highway
# Create a color palette based on distances
pal <- colorNumeric(
  palette = "YlOrRd",  # Yellow to Orange to Red
  domain = min_distances_km
)

# Create map
leaflet() %>%
  addProviderTiles(providers$CartoDB.DarkMatter) %>%
  # Add highway
  addPolylines(
    data = highway_wgs84,
    color = "red",
    weight = 2
  ) %>%
  # Add points colored by distance
  addCircleMarkers(
    data = ap_wgs84,
    radius = 3,
    color = ~pal(min_distances_km),
    popup = ~paste("Distance to highway:", round(min_distances_km, 2), "km"),
    fillOpacity = 0.7
  ) %>%
  # Add legend
  
  addLegend(
    position = "bottomright",
    pal = pal,
    values = min_distances_km,
    title = "Distance to Highway (km)"
  ) %>%
  setView(lng = -80.782127, lat = 8.537981, zoom = 7)
Code
# Calculate correlation between access points and schools considering spatial weights
ap_schools_correlation <- cor.test(
  district_data_complete$access_points,
  district_data_complete$schools
)

# And with density
ap_density_correlation <- cor.test(
  district_data_complete$access_points,
  district_data_complete$density
)

print("\nSpatial Correlations:")
[1] "\nSpatial Correlations:"
Code
print("Access Points vs Schools:")
[1] "Access Points vs Schools:"
Code
print(ap_schools_correlation)

    Pearson's product-moment correlation

data:  district_data_complete$access_points and district_data_complete$schools
t = 9.7615, df = 73, p-value = 7.053e-15
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6337455 0.8365652
sample estimates:
      cor 
0.7524744 
Code
print("\nAccess Points vs Density:")
[1] "\nAccess Points vs Density:"
Code
print(ap_density_correlation)

    Pearson's product-moment correlation

data:  district_data_complete$access_points and district_data_complete$density
t = 2.3802, df = 73, p-value = 0.01992
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.04408264 0.46688466
sample estimates:
      cor 
0.2683595 
Code
leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  # Districts with detailed popups
  addPolygons(
    data = district_data_complete,
    weight = 1,
    color = "red",
    fillOpacity = 0.4,
    popup = ~paste(
      "<b>District:</b>", NAME_2,
      "<br><b>Province:</b>", NAME_1,
      "<br><b>Population:</b>", population.x,
      "<br><b>Density:</b>", round(density, 2),
      "<br><b>Number of Schools:</b>", schools,
      "<br><b>Number of Access Points:</b>", access_points,
      "<br><b>AP per School:</b>", round(ap_per_school, 2),
      "<br><b>AP per 1000 People:</b>", round(ap_per_1000_people, 2)
    ),
    group = "Districts"
  ) %>%
  # Schools with detailed popups
  addCircleMarkers(
    data = schools_data,
    radius = 4,
    color = "blue",
    popup = ~paste(
      "<b>School:</b>", name,
      "<br><b>Province:</b>", province,
      "<br><b>District:</b>", district,
      "<br><b>County:</b>", county,
      "<br><b>Type:</b>", type
    ),
    group = "Schools"
  ) %>%
  # Access points with detailed popups
  addCircleMarkers(
    data = ap_data,
    radius = 4,
    color = "green",
    popup = ~paste(
      "<b>Location:</b>", name,
      "<br><b>Province:</b>", province,
      "<br><b>District:</b>", district,
      "<br><b>County:</b>", county,
      "<br><b>Type:</b>", type,
      "<br><b>Date:</b>", date
    ),
    group = "Access Points"
  ) %>%
  # Add layer controls
  addLayersControl(
    overlayGroups = c("Districts", "Schools", "Access Points"),
    options = layersControlOptions(collapsed = FALSE)
  )

References

Berger, Miriam. 2021. “Myanmar’s Military Built a New Capital as a Haven for Power. Other Countries Have Tried That, Too.” Washington Post, February. https://www.washingtonpost.com/world/2021/02/06/myanmars-military-built-new-capital-haven-power-other-countries-have-tried-that-too/.
Campante, Filipe R., Quoc-Anh Do, and Bernardo Guimaraes. 2019. “Capital Cities, Conflict, and Misgovernance.” American Economic Journal: Applied Economics 11 (3): 298–337. https://doi.org/10.1257/app.20170111.
Sackur, Stephen. 2012. “Equatorial Guinea: Obiang’s Future Capital, Oyala.” BBC News, December. https://www.bbc.com/news/magazine-20731448.